Linear programming-based solution methods for constrained partially observable Markov decision processes
Authors
Abstract
Constrained partially observable Markov decision processes (CPOMDPs) have been used to model various real-world phenomena. However, they are notoriously difficult to solve to optimality, and only a few approximation methods exist for obtaining high-quality solutions. In this study, grid-based approximations are used in combination with linear programming (LP) models to generate approximate policies for CPOMDPs. A detailed numerical study is conducted on six CPOMDP problem instances, considering both their finite and infinite horizon formulations. The quality of the approximation algorithms on unconstrained POMDP problems is first established through a comparative analysis against exact solution methods. Then, the performance of the LP-based approaches is evaluated under varying budget levels. Finally, the flexibility of the approach is demonstrated by imposing deterministic policy constraints, and an investigation into their impact on expected rewards and CPU run time is provided. For most problems, these constraints are found to have little effect on the expected reward but introduce a significant increase in run time. For the remaining problems, the reverse is observed: deterministic policies tend to yield lower total rewards than their stochastic counterparts, while the increase in run time is negligible. Overall, these results demonstrate that the LP models can effectively generate approximate policies for CPOMDPs while providing the flexibility to incorporate additional constraints into the underlying model.
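As a rough illustration of the idea, the sketch below sets up the standard occupancy-measure LP for a discounted, budget-constrained decision process whose "states" are points on a fixed belief grid, and solves it with scipy.optimize.linprog. All model quantities (the grid size, the grid transition matrix T, rewards, costs, and the budget) are placeholder data rather than the paper's benchmark instances; in a grid-based CPOMDP approximation, T would be obtained by projecting the exact belief updates back onto the grid (e.g., by interpolation).

```python
# Minimal sketch: occupancy-measure LP for a constrained, discounted decision
# process over a fixed grid of belief points. All data below are illustrative.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

n_beliefs, n_actions = 5, 2          # grid points standing in for beliefs
gamma, budget = 0.95, 8.0            # discount factor and cost budget

# Toy model data (in practice, induced by the POMDP and the belief grid).
rewards = rng.uniform(0.0, 1.0, size=(n_beliefs, n_actions))
costs = rng.uniform(0.0, 0.5, size=(n_beliefs, n_actions))
T = rng.dirichlet(np.ones(n_beliefs), size=(n_beliefs, n_actions))  # T[b, a, b']
alpha = np.full(n_beliefs, 1.0 / n_beliefs)                         # initial distribution

# Decision variables y[b, a] >= 0: discounted occupancy measure, flattened row-major.
n_vars = n_beliefs * n_actions
obj = -rewards.reshape(n_vars)       # linprog minimizes, so negate the reward

# Flow-balance equalities:
#   sum_a y[b', a] - gamma * sum_{b, a} T[b, a, b'] y[b, a] = alpha[b']  for all b'
A_eq = np.zeros((n_beliefs, n_vars))
for bp in range(n_beliefs):
    for b in range(n_beliefs):
        for a in range(n_actions):
            col = b * n_actions + a
            A_eq[bp, col] -= gamma * T[b, a, bp]
            if b == bp:
                A_eq[bp, col] += 1.0
b_eq = alpha

# Budget constraint: expected discounted cost must not exceed the budget.
A_ub = costs.reshape(1, n_vars)
b_ub = np.array([budget])

res = linprog(obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=(0, None), method="highs")
if not res.success:
    raise RuntimeError(res.message)

# Recover a (generally stochastic) policy: pi(a | b) proportional to y[b, a].
y = res.x.reshape(n_beliefs, n_actions)
policy = y / np.maximum(y.sum(axis=1, keepdims=True), 1e-12)
print("Expected discounted reward:", -res.fun)
print("Policy:\n", np.round(policy, 3))
```

Reading the policy off the occupancy measure generally yields a stochastic policy; restricting it to be deterministic, as in the experiments described above, would require additional constraints (e.g., integrality) on the LP, which is consistent with the increase in run time reported in the abstract.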
Similar resources
Approximate Linear Programming for Constrained Partially Observable Markov Decision Processes
In many situations, it is desirable to optimize a sequence of decisions by maximizing a primary objective while respecting some constraints with respect to secondary objectives. Such problems can be naturally modeled as constrained partially observable Markov decision processes (CPOMDPs) when the environment is partially observable. In this work, we describe a technique based on approximate lin...
Efficient Dynamic-programming Updates in Partially Observable Markov Decision Processes
We examine the problem of performing exact dynamic-programming updates in partially observable Markov decision processes (pomdps) from a computational complexity viewpoint. Dynamic-programming updates are a crucial operation in a wide range of pomdp solution methods and we find that it is intractable to perform these updates on piecewise-linear convex value functions for general pomdps. We offer ...
Partially observable Markov decision processes
For reinforcement learning in environments in which an agent has access to a reliable state signal, methods based on the Markov decision process (MDP) have had many successes. In many problem domains, however, an agent suffers from limited sensing capabilities that preclude it from recovering a Markovian state signal from its perceptions. Extending the MDP framework, partially observable Markov...
Bounded-Parameter Partially Observable Markov Decision Processes
The POMDP is considered as a powerful model for planning under uncertainty. However, it is usually impractical to employ a POMDP with exact parameters to model precisely the real-life situations, due to various reasons such as limited data for learning the model, etc. In this paper, assuming that the parameters of POMDPs are imprecise but bounded, we formulate the framework of bounded-parameter...
Journal
Journal title: Applied Intelligence
Year: 2023
ISSN: 0924-669X, 1573-7497
DOI: https://doi.org/10.1007/s10489-023-04603-7